This geospatial statistical model uses routinely collected malaria case data, population data and remotely sensed data, such as open and vegetated water bodies, to estimate population living around open water bodies, expected malaria cases, and standardised morbidity ratio (SMR) of malaria. And ultimately, quantify the association between proximity to larval habitat and malaria risk in health facility catchment areas in Kasungu. The SMR compares the risk of morbidity in a population of interest with that of a standard population. In this case, our interest is to find out whether the number of dry season malaria cases in each catchment area are greater than we would expect given the malaria rate for the entire Kasungu district.

We do this by comparing what we observe (O) with what we would expect (E) if the risk of malaria was equal throughout Kasungu. The SMR statistical notation of catchment i can be written as follows: \[SMR_i = \frac{O_i}{E_i}\]

Buffers around waterbodies are created and then combined with population data in raster format to estimate the proprtion of catcment population living within 1km, 2km and 3km of water bodies. Subsequently, the observed malaria cases are modeled using Poisson regression to find out if living within various distances from water bodies is causing variability in malaria risk in Kasungu district. We hypothesize that the risk of being a case in a catchment is dependent on proximity to water bodies. The data used spans from 2017 to 2020 and was derived from digitized DHIS2 malaria records, accessibility mapping, aggregated population geospatial layer and TropWet tool in Google Earth Engine.

Load packages

Loading the R packages that will be used to read in, view, transform and model the malaria cases and spatial datasets.

library(spatialEco)
library(dplyr)
library(tidyverse)
library(ggplot2)
library(plotly)
library(lubridate)
library(knitr)
library(raster)
library(rgdal)
library(rgeos)
library(sf)
library(sp)
library(tmap)
library(spdep)
library(maptools)
library(gridExtra)
library(grid)
library(exactextractr)
library(DataExplorer)
library(mapview)
`%>%` <- magrittr::`%>%`

Tell R where the data is

here::here()
## [1] "C:/Users/cnkolokosa/Documents/R/upscaled_2021_updated_May/upscaled_2021"

Load datasets

The total dry season malaria cases recorded at health-care facilities in Kasungu from 2017 to 2019 are contained in the KasunguData.csv sourced from https://dhis2.health.gov.mw/. The kasungu_facility_catchments_2004.shp shapefile also contains the population and health information within each health-facility catchment area in Kasungu district.

The aggregated population raster layers for Malawi e.g.,ku_pop_2017_1km_aggregated.tif were downloaded from the Open Spatial and Demographic and Data Research website: https://www.worldpop.org/geodata/country?iso3=MWI. These layers estimate total number of people per grid-cell. The units are number of people per pixel with country totals adjusted to match the corresponding official United Nations population estimates. The datasets were downloaded in Geotiff at a resolution of 1km and are projected in Geographic Coordinate System, WGS84.

The kasungu_water.shpand water_bodies layers contain open and vegetated waterbodies polygons, detected using the Tropical Wetland Unmixing Tool (TropWet). TropWet is a Google Earth Engine hosted toolbox that uses the Landsat archive to map tropical wetlands and can be accessed through: https://www.aber.ac.uk/en/dges/research/earth-observation-laboratory/research/tropwet/

# 2017, 2018 and 2019 dry season malaria cases 
dry_season_malaria_2015_2019 <- read.csv(here::here("data", "dry_season_malaria 2015-2019.csv"))

# Export 2017, 2018 and 2019 dry season malaria cases
write.csv(dry_season_malaria_2015_2019, file = "data/dry_season_malaria_2017_2019.csv")

write.table(dry_season_malaria_2015_2019, 
            file = "dry_season_malaria_2017_2019.txt", 
            sep = ",", 
            quote = FALSE, 
            row.names = FALSE)

# 2020 dry season malaria cases
ku_malaria_2020 <- read.csv(here::here("data", "dry_season_malaria_2020.csv"))


# Merge 2015 to 2019 dry season malaria case data with 2020 data
dry_season_malaria_2017_2020 <- cbind.data.frame(dry_season_malaria_2015_2019, ku_malaria_2020) 
 
dry_season_malaria_2017_2020 <- dry_season_malaria_2017_2020[,c("FID", "Names", "dr_2017", 
                                                                "dr_2018", "dr_2019", "dr_2020",
                                                                "LONGITU", "LATITUD")]

# Export 'dry season malaria 2017-2020' as csv
write.csv(dry_season_malaria_2017_2020, file = "data/dry_season_malaria_2017_2019.csv")


# Kasungu district boundary shapefile 
kasungu_district <- st_read(here::here("data", "kasungu_district.shp"))
## Reading layer `kasungu_district' from data source `C:\Users\cnkolokosa\Documents\R\upscaled_2021_updated_May\upscaled_2021\data\kasungu_district.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 5 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 491272.7 ymin: 8494349 xmax: 609044.2 ymax: 8632164
## Projected CRS: WGS 84 / UTM zone 36S
# Kasungu health facility catchments generated from accessibility mapping

malire_new <- sf::st_read(here::here("data", "new_catchments.shp")) %>% 
              sf::st_transform(32736) # reproject to WGS UTM Zone 36 South
## Reading layer `new_catchments' from data source `C:\Users\cnkolokosa\Documents\R\upscaled_2021_updated_May\upscaled_2021\data\new_catchments.shp' using driver `ESRI Shapefile'
## Simple feature collection with 27 features and 1 field
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 32.925 ymin: -13.61667 xmax: 34.00833 ymax: -12.375
## Geodetic CRS:  WGS 84
# Kasungu population raster layer
kasungu_population_2017 <- raster(here::here("data", "ku_pop_2017_1km_aggregated.tif"))

kasungu_population_2018 <- raster(here::here("data", "ku_pop_2018_1km_aggregated.tif"))

kasungu_population_2019 <- raster(here::here("data", "ku_pop_2019_1km_aggregated.tif"))

kasungu_population_2020 <- raster(here::here("data", "ku_pop_2020_1km_aggregated.tif"))

# Read in waterbodies polygons 
dryseason_waterbodies_2017 <- st_read(here::here("data", "water_bodies_2017.shp"))
## Reading layer `water_bodies_2017' from data source `C:\Users\cnkolokosa\Documents\R\upscaled_2021_updated_May\upscaled_2021\data\water_bodies_2017.shp' using driver `ESRI Shapefile'
## Simple feature collection with 168 features and 1 field
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 514497 ymin: 8495941 xmax: 603149.8 ymax: 8620169
## Projected CRS: WGS 84 / UTM zone 36S
dryseason_waterbodies_2018 <- st_read(here::here("data", "kasungu_2018_water.shp"))
## Reading layer `kasungu_2018_water' from data source `C:\Users\cnkolokosa\Documents\R\upscaled_2021_updated_May\upscaled_2021\data\kasungu_2018_water.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1105 features and 1 field
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 496807.6 ymin: 8494693 xmax: 607913.8 ymax: 8607747
## Projected CRS: WGS 84 / UTM zone 36S
dryseason_waterbodies_2019 <- st_read(here::here("data", "kasungu_2019_water.shp"))
## Reading layer `kasungu_2019_water' from data source `C:\Users\cnkolokosa\Documents\R\upscaled_2021_updated_May\upscaled_2021\data\kasungu_2019_water.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1941 features and 1 field
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 494197.2 ymin: 8494693 xmax: 607913.8 ymax: 8617573
## Projected CRS: WGS 84 / UTM zone 36S
dryseason_waterbodies_2020 <- st_read(here::here("data", "water_bodies_2020.shp"))
## Reading layer `water_bodies_2020' from data source `C:\Users\cnkolokosa\Documents\R\upscaled_2021_updated_May\upscaled_2021\data\water_bodies_2020.shp' using driver `ESRI Shapefile'
## Simple feature collection with 266 features and 1 field
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 508985.6 ymin: 8495793 xmax: 585761.1 ymax: 8620169
## Projected CRS: WGS 84 / UTM zone 36S
# Add a field ID to water bodies polygons 
dryseason_waterbodies_2017$ID <- 1:nrow(dryseason_waterbodies_2017)

dryseason_waterbodies_2018$ID <- 1:nrow(dryseason_waterbodies_2018)

dryseason_waterbodies_2019$ID <- 1:nrow(dryseason_waterbodies_2019)

dryseason_waterbodies_2020$ID <- 1:nrow(dryseason_waterbodies_2020)

View the dry season malaria case data

We observe that Kasungu district has 30 health facilities classified as dispensary, health centre, district hospital and rural hospital, and the highest malaria cases were recorded at Kasungu District Hospital.

dry_season_malaria_2017_2020 %>% 
  summary()
##       FID           Names              dr_2017         dr_2018     
##  Min.   : 1.00   Length:36          Min.   :  427   Min.   :  749  
##  1st Qu.: 9.75   Class :character   1st Qu.: 2196   1st Qu.: 2424  
##  Median :18.50   Mode  :character   Median : 3132   Median : 3357  
##  Mean   :18.50                      Mean   : 3586   Mean   : 3909  
##  3rd Qu.:27.25                      3rd Qu.: 3993   3rd Qu.: 4684  
##  Max.   :36.00                      Max.   :16289   Max.   :15821  
##                                     NA's   :6       NA's   :6      
##     dr_2019         dr_2020         LONGITU         LATITUD      
##  Min.   :  533   Min.   :    0   Min.   :33.18   Min.   :-13.57  
##  1st Qu.: 1748   1st Qu.: 2286   1st Qu.:33.38   1st Qu.:-13.25  
##  Median : 2556   Median : 3824   Median :33.50   Median :-12.98  
##  Mean   : 2880   Mean   : 4657   Mean   :33.52   Mean   :-12.99  
##  3rd Qu.: 3284   3rd Qu.: 6212   3rd Qu.:33.68   3rd Qu.:-12.79  
##  Max.   :10721   Max.   :24424   Max.   :33.87   Max.   :-12.42  
##  NA's   :6                       NA's   :6       NA's   :6
dry_season_malaria_2017_2020 %>%  
  plotly::plot_ly(y = ~Names,
                  x = ~dr_2017,
                  type = "bar",
                  orientation = 'h',
                  name = "2017") %>%
  plotly::add_trace(x = ~ dr_2018,
                    name = "2018") %>%
  plotly::add_trace(x = ~ dr_2019,
                    name = "2019") %>% 
  plotly::add_trace(x = ~ dr_2020,
                    name = "2020") %>% 
  plotly::layout(#barmode = "stack",
                 xaxis = list(title = "Total malaria cases"),
                 yaxis = list(title = ""),
                 hovermode = "compare",
                 margin = list(b = 10,
                               t = 10,
                               pad = 2))

Fig.1 The total malaria cases recorded at each health-care facility in Kasungu district

Kasungu health-care facilities and their catchment areas

Heath facility catchment area is the area from which a health facility attracts patients. The new health facility catchments polygon was generated from generic accessibility mapping script adapted from https://malariaatlas.org/wp-content/uploads/accessibility/R_generic_accessibilty_mapping_script.r The script requires two user supplied datasets: the 2015 friction surface, which is available here: http://www.map.ox.ac.uk/accessibility_to_cities/, and a user-supplied .csv of points dry_season_malaria_2017_2020. The accumulated cost algorithm accCost and r.Cost algorithm in GRASS GIS were run to make the final output map of new health facility catchment boundaries.

# Using the complete.cases() function to select health centres with complete 
# longitude and latitude coordinates.
# The health facilities close to each other e.g., Kasalika Health Centre and 
# Kasungu District Hospital, and Kaluluma Rural Hospital and Nkhamenya Rural Hospital
# were aggregated in order to generate catchment areas that are geographically correct


zipatala_aggregated <- dry_season_malaria_2017_2020[complete.cases(dry_season_malaria_2017_2020),] %>% 
  dplyr::filter(Names != "Kasalika Health Centre",
                Names != "Bua Health Centre",
                Names != "Kaluluma Rural Hospital") 

zipatala_aggregated$dr_2017[which(zipatala_aggregated$Names == "Kasungu District Hospital")] <- 4528 + 16289

zipatala_aggregated$dr_2018[which(zipatala_aggregated$Names == "Kasungu District Hospital")] <- 4493 +15821

zipatala_aggregated$dr_2019[which(zipatala_aggregated$Names == "Kasungu District Hospital")] <- 2729 + 10721

zipatala_aggregated$dr_2020[which(zipatala_aggregated$Names == "Kasungu District Hospital")] <- 4368 + 24424
  
zipatala_aggregated$dr_2017[which(zipatala_aggregated$Names == "Nkhamenya Rural Hospital")] <- 2887 + 752

zipatala_aggregated$dr_2018[which(zipatala_aggregated$Names == "Nkhamenya Rural Hospital")] <- 851 + 3689

zipatala_aggregated$dr_2019[which(zipatala_aggregated$Names == "Nkhamenya Rural Hospital")] <- 533 + 4004

zipatala_aggregated$dr_2020[which(zipatala_aggregated$Names == "Nkhamenya Rural Hospital")] <- 3587 + 5929

zipatala_aggregated$dr_2017[which(zipatala_aggregated$Names == "Mziza Health Centre")] <- 3863 + 3489

zipatala_aggregated$dr_2018[which(zipatala_aggregated$Names == "Mziza Health Centre")] <- 2815 + 1804

zipatala_aggregated$dr_2019[which(zipatala_aggregated$Names == "Mziza Health Centre")] <- 2439 + 1740

zipatala_aggregated$dr_2020[which(zipatala_aggregated$Names == "Mziza Health Centre")] <- 6194 + 2397

# write.csv(zipatala_aggregated, "data/health_facilities_aggregated.csv")


health_facility_aggr_sf <- sf::st_as_sf(zipatala_aggregated,
                                        coords = c("LONGITU", "LATITUD"),
                                        crs = 4326, agr = "constant")

# st_write(health_facility_aggr_sf, "data/health_facilities_aggregated.shp")

View location of the health facilities in the new catchment areas

# Plot map
tm_shape(malire_new)+
  tm_polygons()+
  tm_shape(health_facility_aggr_sf)+
  tm_dots(size = .3, 
          col = "blue", 
          alpha = 0.5)+
  tm_text("Names", 
          size = .3, 
          just = "top", 
          col = "black", 
          remove.overlap = TRUE)+
  tm_layout(frame = FALSE,
            title = "New Kasungu health facility \n catchment boundaries",
            title.size = .8, 
            title.position = c("left", "top"))+
  tm_compass(position=c("right", "top"))+
  tm_scale_bar(breaks = c(0, 10, 20), 
               text.size = .5)
Fig 2. Kasungu health-care facilities and catchment areas

Fig 2. Kasungu health-care facilities and catchment areas

Kasungu district estimated population per grid-cell

# Take a glimpse at the WorldPop raster layers
kasungu_population_2017
## class      : RasterLayer 
## dimensions : 150, 128, 19200  (nrow, ncol, ncell)
## resolution : 920.0898, 918.7667  (x, y)
## extent     : 491272.7, 609044.2, 8494349, 8632164  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=36 +south +datum=WGS84 +units=m +no_defs 
## source     : C:/Users/cnkolokosa/Documents/R/upscaled_2021_updated_May/upscaled_2021/data/ku_pop_2017_1km_aggregated.tif 
## names      : ku_pop_2017_1km_aggregated
kasungu_population_2018
## class      : RasterLayer 
## dimensions : 150, 128, 19200  (nrow, ncol, ncell)
## resolution : 920.0898, 918.7667  (x, y)
## extent     : 491272.7, 609044.2, 8494349, 8632164  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=36 +south +datum=WGS84 +units=m +no_defs 
## source     : C:/Users/cnkolokosa/Documents/R/upscaled_2021_updated_May/upscaled_2021/data/ku_pop_2018_1km_aggregated.tif 
## names      : ku_pop_2018_1km_aggregated 
## values     : 0, 6253.557  (min, max)
kasungu_population_2019
## class      : RasterLayer 
## dimensions : 150, 128, 19200  (nrow, ncol, ncell)
## resolution : 920.0898, 918.7667  (x, y)
## extent     : 491272.7, 609044.2, 8494349, 8632164  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=36 +south +datum=WGS84 +units=m +no_defs 
## source     : C:/Users/cnkolokosa/Documents/R/upscaled_2021_updated_May/upscaled_2021/data/ku_pop_2019_1km_aggregated.tif 
## names      : ku_pop_2019_1km_aggregated 
## values     : 0, 6483.727  (min, max)
kasungu_population_2020
## class      : RasterLayer 
## dimensions : 150, 128, 19200  (nrow, ncol, ncell)
## resolution : 920.0898, 918.7667  (x, y)
## extent     : 491272.7, 609044.2, 8494349, 8632164  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=36 +south +datum=WGS84 +units=m +no_defs 
## source     : C:/Users/cnkolokosa/Documents/R/upscaled_2021_updated_May/upscaled_2021/data/ku_pop_2020_1km_aggregated.tif 
## names      : ku_pop_2020_1km_aggregated 
## values     : 0, 7949.033  (min, max)
# Function to create a raster population map
create.population.map <- function(population.raster, title){
  # raster population map
  # arguments:
  #   population.raster:  aggregated population raster layer from WorldPop
  #   legend.title: legend title
  # returns:
  #   a tmap-element (plots a map)
  tm_shape(population.raster)+
    tm_raster(palette = "YlOrBr", 
              title = title,
              breaks = c(0,100,200,400,600,800,1000,2000,4000,6000,8000))+
    tm_layout(legend.position = c("right", "bottom"),
              frame = FALSE)+
    tm_scale_bar(position = c("left", "bottom"))
}
# Set to static map
tmap_mode("plot")

estimated_pop_2017 <- create.population.map(kasungu_population_2017, title = "2017 Population")

estimated_pop_2018 <- create.population.map(kasungu_population_2018, title = "2018 Population")

estimated_pop_2019 <- create.population.map(kasungu_population_2019, title = "2019 Population")

estimated_pop_2020 <- create.population.map(kasungu_population_2020, title = "2020 Population")

# Layout the maps
tmap_arrange(estimated_pop_2017, estimated_pop_2018, estimated_pop_2019, estimated_pop_2020, nrow = 2) 
Fig.3 Estimated total number of people per 1km grid-cell

Fig.3 Estimated total number of people per 1km grid-cell

Assign dry season malaria cases and population density to new health facility catchments

The WorldPop aggregated population e.g. kasungu_population_2017.tif, and DHIS2 malaria dry_season_malaria_2017_2020 datasets are assigned to the new health facility catchments.

# Create a function that assigns malaria data from health facilities to their catchments areas ----------------
assign.malaria.data <- function(catchment_boundary, malaria_data){
  # function to assign malaria cases from health facilities to their catchment area
  # arguments:
  #   catchment_boundary: sf polygon object of new catchment boundaries
  #   malaria_data: sf point object with a data frame containing the dry season malaria cases
  # returns:
  #   catchments_malaria_sf: sf polygon object with a data frame containing dry season malaria cases


  # Convert sf objects to spatial
  catchment_shp <- as(catchment_boundary, "Spatial")
  
  malaria_shp <- as(malaria_data, "Spatial")

  # Match CRS
  malaria_shp <- spTransform(malaria_shp, crs(catchment_shp))

  # Overlay aggregated health facility points and extract 2017 - 2020 malaria cases
  # Using 'point.in.poly' to return a point spatial object, in this case location of health facilities
  # and estimated population instead of sp::over function, which simply returns 
  # a data frame, with the same no. rows.
  # Argument 'sp = TRUE' returns an sp class object, else returns sf class object
  # Joining the malaria and population dataset using only 'merge' function can't work due to 
  # non-unique columns and differences in row numbers
  
  hospitals_in_catchment <- spatialEco::point.in.poly(malaria_shp, catchment_shp, sp = TRUE) 
 

  # Add the extracted ID, health facility names and dry season malaria cases to 
  # the health facility catchments (hfc)
  hfc_malaria_shp <- merge(catchment_shp, hospitals_in_catchment, by.x = "DN", by.y = "FID")

  # Convert the shapefile containing malaria data to sf-object
  hfc_malaria_sf <- sf::st_as_sf(hfc_malaria_shp)

  # Tidy the data by dropping columns not needed
  catchment_malaria <- hfc_malaria_sf %>% 
    dplyr::select(-c(coords.x1, coords.x2))

  return(out = catchment_malaria)
}


# Invoking the function ----------------------------------------------------------------------------------
malaria_by_catchment <- assign.malaria.data(malire_new, health_facility_aggr_sf)

Assign population data to the health catchment areas

# Create a function to extract population from WorldPop raster file and assign ---------------------------
# the values to the new catchments.

extract.pop.values <- function(kasungu_pop_raster, catchments){
  # function to extract population from raster file and assign the population to catchments
  # arguments:
  #   kasungu_pop_raster: population raster file clipped to Kasungu district
  #   catchments: shapefile containing the polygons that we wish to use as boundaries
  # returns:
  #   catchments_malaria_pop_sf: sf polygon object containing malaria and population data
  
  # convert from sf to sp
  catchments_sp <- as(catchments, "Spatial")
  
  # Match extent i.e projection
  catchments_sp <- spTransform(catchments_sp, proj4string(kasungu_pop_raster))
  
  # Crop and mask the population raster to exclude Kasungu National Park
  pop_raster_clip <- raster::mask(raster::crop(kasungu_pop_raster, extent(catchments_sp)), catchments_sp)
  
  # Extracting zonal statistics from a population raster layer. 
  # The population raster is a continuous gridded surface layer that has an 
  # estimated population density value to every square in their grid. 
  # The population values are then summed and apportioned to the catchment polygons
  # catchments_malaria_pop <- catchments %>% 
  #   dplyr::mutate(pop = round(raster::extract(pop_raster_clip, catchments, fun = sum, na.rm = TRUE)))
  
  pop_by_catchment <- round(raster::extract(pop_raster_clip, catchments, fun = sum, na.rm = TRUE))
  
  pop_by_catchment_df <-  pop_by_catchment %>%  
  # apply unlist to the lists to have vectors as the list elements
  lapply(unlist) %>% 
  # convert vectors to data.frames
  lapply(as_tibble) %>% 
  # combine the list of data.frames
  bind_rows(., .id = "rowID") %>% 
  # rename the value variable
  dplyr::rename(pop = value)
  
  # Add row ID to column to catchment layer
  catchments$rowID <- 1:nrow(catchments)
  
  # Merge catchment areas and population data 
  pop_by_catchments <- merge(catchments, pop_by_catchment_df, by.x = "rowID", by.y = "rowID")
  
  # Cleaning 'Inf' values
  pop_by_catchments %>% 
    dplyr::mutate_if(is.numeric, list(~na_if(., Inf))) %>% 
    dplyr::mutate_if(is.numeric, list(~na_if(., -Inf)))

  return(out = pop_by_catchments)
  
}

# Invoking the function ---------------------------------------------------------------------------------------
malaria_pop_by_catchment_2017 <- extract.pop.values(kasungu_population_2017, malaria_by_catchment)

malaria_pop_by_catchment_2018 <- extract.pop.values(kasungu_population_2018, malaria_by_catchment)

malaria_pop_by_catchment_2019 <- extract.pop.values(kasungu_population_2019, malaria_by_catchment)

malaria_pop_by_catchment_2020 <- extract.pop.values(kasungu_population_2020, malaria_by_catchment)

View Kasungu population by catchment maps

Estimated total number of people within health facility catchment areas.

# Function to create maps of estimated population by catchment areas ---------------------------------------
create.population.map <- function(catchment.area, 
                                  variable = "pop", 
                                  title, 
                                  legend.title = "Estimated \n population"){
  # estimated population map
  # catchment.area: estimated population layer from nachulu function
  # variable: variable name (as character, in qoutes)
  # title: map title in quotes
  # legend.title: legend title in qoutes
  # returns:
  #   a tmap-element (plots a map)
  tm_shape(catchment.area)+
    tm_fill(col = variable, 
            breaks = c(0, 13000, 19000, 27000, 35000, 70000, 140000, 200000),
            palette = "YlOrBr",
            title = legend.title)+
    tm_borders(col = "grey",
               lwd = 0.4)+
    tm_layout(legend.position = c(0.75, "bottom"),
              legend.text.size = 0.6,
              legend.title.size = 0.8,
              frame = FALSE)+
    tm_credits(title, 
               position = c(0.3, 0.8), 
               size = 1)
}

# Invoking the function --------------------------------------------------------------------------------
pop_by_catchment_2017 <- create.population.map(malaria_pop_by_catchment_2017, title = "2017")

pop_by_catchment_2018 <- create.population.map(malaria_pop_by_catchment_2018, title = "2018")

pop_by_catchment_2019 <- create.population.map(malaria_pop_by_catchment_2019, title = "2019")

pop_by_catchment_2020 <- create.population.map(malaria_pop_by_catchment_2020, title = "2020")

tmap::tmap_arrange(pop_by_catchment_2017, pop_by_catchment_2018,
                   pop_by_catchment_2019, pop_by_catchment_2020, ncol = 2)
Fig. 4: Estimated population in Kasungu health facility catchment areas

Fig. 4: Estimated population in Kasungu health facility catchment areas

Calculate the expected number of cases for each catchment area

The expected number of dry season malaria cases in catchment i are calculated as the observed risk (r) of malaria i.e. the total number of malaria cases in Kasungu district divided by the total population of the district, multiplied by the number of people in the catchment area: \[E_i = \frac{\sum_i O_i}{\sum_i N_i}\times N_i\]

The expected number of dry season malaria cases are calculated under the assumption that there is no spatial variation in risk, i.e., no difference in infection rates between the catchment areas.

# Calculate expected malaria cases --------------------------------------------------------------
expected_malaria_2017 <- malaria_pop_by_catchment_2017 %>% 
  dplyr::rename(
    observed_2017 = dr_2017,
     pop_2017 = pop) %>% 
  dplyr::mutate(
    expected_2017 = round(sum(observed_2017)/sum(pop_2017, na.rm = TRUE)*pop_2017))

expected_malaria_2018 <- malaria_pop_by_catchment_2018 %>% 
  dplyr::rename(
    observed_2018 = dr_2018,
    pop_2018 = pop) %>% 
  dplyr::mutate(
    expected_2018 = round(sum(observed_2018)/sum(pop_2018, na.rm = TRUE)*pop_2018))

expected_malaria_2019 <- malaria_pop_by_catchment_2019 %>% 
  dplyr::rename(
    observed_2019 = dr_2019,
    pop_2019 = pop) %>%
  dplyr::mutate(
    expected_2019 = round(sum(observed_2019)/sum(pop_2019, na.rm = TRUE)*pop_2019)) 

expected_malaria_2020 <- malaria_pop_by_catchment_2020 %>% 
  dplyr::rename(
    observed_2020 = dr_2020,
    pop_2020 = pop) %>% 
  dplyr::mutate(
    expected_2020 = round(sum(observed_2020)/sum(pop_2020, na.rm = TRUE)*pop_2020))

Calculate the Standardised Morbidity Ratio of malaria incidences for each catchment area

The SMR compares the risk of morbidity in a population of interest with that of a standard population. In this case, our interest is to find out whether the number of dry season malaria cases in each catchment area are greater than we would expect given the malaria rate for the entire Kasungu district.

We do this by comparing what we observe (O) with what we would expect (E) if the risk of malaria was equal throughout Kasungu. The SMR statistical notation of catchment i can be written as follows: \[SMR_i = \frac{O_i}{E_i}\]

# Calculate Standard Morbidity Ratio (SMR) -------------------------------------
SMR_2017 <- expected_malaria_2017 %>% 
  dplyr::mutate(SMR = round(observed_2017/expected_2017, 1)) %>% 
  dplyr::select(rowID,Names, pop_2017, observed_2017, expected_2017, SMR) 

SMR_2018 <- expected_malaria_2018 %>% 
  dplyr::mutate(SMR = round(observed_2018/expected_2018, 1)) %>% 
  dplyr::select(rowID, Names, pop_2018, observed_2018, expected_2018, SMR) 

SMR_2019 <- expected_malaria_2019 %>% 
  dplyr::mutate(SMR = round(observed_2019/expected_2019, 1)) %>% 
  dplyr::select(rowID, Names, pop_2019, observed_2019, expected_2019, SMR) 

SMR_2020 <- expected_malaria_2020 %>% 
  dplyr::mutate(SMR = round(observed_2020/expected_2020, 1)) %>% 
  dplyr::select(rowID, Names, pop_2020, observed_2020, expected_2020, SMR)


# Create SMR tables ------------------------------------------------------------
SMR_table_2017 <- SMR_2017 %>% 
  dplyr::as_tibble() %>% 
  dplyr::select(-rowID, -geometry) %>% 
  kable %>%
  kableExtra::kable_styling(full_width = FALSE)

SMR_table_2018 <- SMR_2018 %>% 
  dplyr::as_tibble() %>% 
  dplyr::select(-rowID, -geometry) %>% 
  kable %>% 
  kableExtra::kable_styling(full_width = FALSE)

SMR_table_2019 <- SMR_2019 %>% 
  dplyr::as_tibble() %>% 
  dplyr::select(-rowID, -geometry) %>% 
  kable %>% 
  kableExtra::kable_styling(full_width = FALSE)

SMR_table_2020 <- SMR_2020 %>% 
  dplyr::as_tibble() %>% 
  dplyr::select(-rowID, -geometry) %>% 
  kable %>% 
  kableExtra::kable_styling(full_width = FALSE)

SMR_table_2017
Names pop_2017 observed_2017 expected_2017 SMR
Lodjwa Health Centre 9923 1161 1372 0.8
Nkhamenya Rural Hospital 40154 3639 5553 0.7
Newa Mpasazi Health Centre 13879 427 1919 0.2
Mpepa /Chisinga Health Centre 27459 2784 3797 0.7
Mnyanja Health Centre 39950 3298 5524 0.6
Simlemba Health Centre 26999 2197 3734 0.6
Ofesi Health Centre 28098 4036 3886 1.0
Chulu Health Centre 27906 5801 3859 1.5
Kapelula Health Centre 35727 4138 4940 0.8
Livwezi Health Centre 22009 1307 3043 0.4
Gogode Dispensary 13061 2745 1806 1.5
Dwangwa Dispensary 32704 2192 4522 0.5
Chamama Health Facility 20026 1878 2769 0.7
Wimbe Health Centre 11864 5660 1641 3.4
Chinyama 12768 2292 1766 1.3
Mdunga Health Centre 18177 2045 2514 0.8
Mtunthama Health Centre 18744 3622 2592 1.4
Kasungu District Hospital 143490 20817 19842 1.0
Chamwabvi Health Centre 35353 3601 4889 0.7
Linyangwa Health Centre 17772 3359 2458 1.4
Kawamba Health Centre 22865 5808 3162 1.8
Mziza Health Centre 44189 7352 6111 1.2
Kamboni Health Centre 21226 3824 2935 1.3
Khola Health Centre 16956 2195 2345 0.9
Santhe Health Centre 6096 5838 843 6.9
Anchor Farm 48861 2966 6757 0.4
Mkhota Health Centre 21621 2586 2990 0.9
SMR_table_2018
Names pop_2018 observed_2018 expected_2018 SMR
Lodjwa Health Centre 10281 1151 1508 0.8
Nkhamenya Rural Hospital 41642 4540 6107 0.7
Newa Mpasazi Health Centre 14248 749 2089 0.4
Mpepa /Chisinga Health Centre 28488 3602 4178 0.9
Mnyanja Health Centre 41856 2864 6138 0.5
Simlemba Health Centre 27455 2515 4026 0.6
Ofesi Health Centre 29002 3446 4253 0.8
Chulu Health Centre 28832 4502 4228 1.1
Kapelula Health Centre 37630 5338 5518 1.0
Livwezi Health Centre 22544 2361 3306 0.7
Gogode Dispensary 13368 4138 1960 2.1
Dwangwa Dispensary 33534 2394 4918 0.5
Chamama Health Facility 20372 1750 2987 0.6
Wimbe Health Centre 11814 5010 1732 2.9
Chinyama 13138 2116 1927 1.1
Mdunga Health Centre 18928 2923 2776 1.1
Mtunthama Health Centre 19074 5308 2797 1.9
Kasungu District Hospital 147175 20314 21582 0.9
Chamwabvi Health Centre 36167 4027 5304 0.8
Linyangwa Health Centre 18032 3268 2644 1.2
Kawamba Health Centre 22902 6462 3358 1.9
Mziza Health Centre 46208 4619 6776 0.7
Kamboni Health Centre 21430 4745 3143 1.5
Khola Health Centre 17315 2802 2539 1.1
Santhe Health Centre 6244 7267 916 7.9
Anchor Farm 49871 3142 7313 0.4
Mkhota Health Centre 22167 5920 3251 1.8
SMR_table_2019
Names pop_2019 observed_2019 expected_2019 SMR
Lodjwa Health Centre 10608 1713 1114 1.5
Nkhamenya Rural Hospital 43293 4537 4546 1.0
Newa Mpasazi Health Centre 14780 809 1552 0.5
Mpepa /Chisinga Health Centre 29456 4248 3093 1.4
Mnyanja Health Centre 43783 3148 4597 0.7
Simlemba Health Centre 28076 2339 2948 0.8
Ofesi Health Centre 30065 4079 3157 1.3
Chulu Health Centre 29731 5470 3122 1.8
Kapelula Health Centre 39747 2558 4173 0.6
Livwezi Health Centre 22945 787 2409 0.3
Gogode Dispensary 13641 2200 1432 1.5
Dwangwa Dispensary 34415 2553 3614 0.7
Chamama Health Facility 20701 1508 2174 0.7
Wimbe Health Centre 11855 3041 1245 2.4
Chinyama 13475 1770 1415 1.3
Mdunga Health Centre 19960 1008 2096 0.5
Mtunthama Health Centre 19385 2763 2035 1.4
Kasungu District Hospital 151079 13450 15863 0.8
Chamwabvi Health Centre 36899 1686 3874 0.4
Linyangwa Health Centre 18279 3330 1919 1.7
Kawamba Health Centre 23041 4402 2419 1.8
Mziza Health Centre 48340 4179 5076 0.8
Kamboni Health Centre 21509 2770 2258 1.2
Khola Health Centre 17761 2708 1865 1.5
Santhe Health Centre 6435 5210 676 7.7
Anchor Farm 50995 1978 5354 0.4
Mkhota Health Centre 22677 2162 2381 0.9
SMR_table_2020
Names pop_2020 observed_2020 expected_2020 SMR
Lodjwa Health Centre 13081 1955 2105 0.9
Nkhamenya Rural Hospital 53692 9516 8641 1.1
Newa Mpasazi Health Centre 18311 3318 2947 1.1
Mpepa /Chisinga Health Centre 36317 7588 5844 1.3
Mnyanja Health Centre 54649 7259 8795 0.8
Simlemba Health Centre 34240 8051 5510 1.5
Ofesi Health Centre 37240 4311 5993 0.7
Chulu Health Centre 36638 9628 5896 1.6
Kapelula Health Centre 50214 7894 8081 1.0
Livwezi Health Centre 27786 1339 4472 0.3
Gogode Dispensary 16681 3957 2684 1.5
Dwangwa Dispensary 42282 4196 6804 0.6
Chamama Health Facility 25248 1490 4063 0.4
Wimbe Health Centre 14367 3690 2312 1.6
Chinyama 16463 2411 2649 0.9
Mdunga Health Centre 25108 3332 4041 0.8
Mtunthama Health Centre 23501 2901 3782 0.8
Kasungu District Hospital 185282 28792 29817 1.0
Chamwabvi Health Centre 45106 1386 7259 0.2
Linyangwa Health Centre 22144 6116 3564 1.7
Kawamba Health Centre 27961 7949 4500 1.8
Mziza Health Centre 60510 8591 9738 0.9
Kamboni Health Centre 25750 6265 4144 1.5
Khola Health Centre 21929 4918 3529 1.4
Santhe Health Centre 7917 7242 1274 5.7
Anchor Farm 62633 2837 10079 0.3
Mkhota Health Centre 27830 6070 4479 1.4

View observed and expected dry season malaria cases

# Function to create maps of observed and expected dry season malaria cases
create.malaria.map <- function(malaria.data, 
                               variable = NA, 
                               title = NA, 
                               legend.title = NA){
  # observed and expected malaria incidence map
  # malaria.data: data frame containing observed and expected malaria cases
  # variable: variable name (as character, in qoutes e.g. "observed")
  # title: map title in quotes
  # legend.title: legend title in qoutes
  # returns:
  #   a tmap-element (plots a map)
  tm_shape(malaria.data)+
    tm_fill(col = variable, 
            breaks = c(0, 500, 1000, 2500, 5000, 10000, 15000, 25000, 35000),
            palette = "YlOrRd",
            title = legend.title)+
    tm_borders(lw = 0.3)+
    tm_layout(legend.position = c(0.75,"bottom"),
              legend.text.size = 0.5,
              legend.title.size = 0.7,
              frame = FALSE)+
    tm_credits(title, 
               position = c(0.2, 0.8), 
               size = 1)
}

# Invoking the function
# 2017 observed and expected malaria cases -------------------------------------
observed_malaria_2017_map <- create.malaria.map(malaria_pop_by_catchment_2017, 
                                                variable = "dr_2017",
                                                title = "2017",
                                                legend.title = "Observed malaria")

expected_malaria_2017_map <- create.malaria.map(expected_malaria_2017,
                                                variable = "expected_2017",
                                                title = "2017",
                                                legend.title = "Expected malaria")

# 2018 observed and expected malaria cases -------------------------------------
observed_malaria_2018_map <- create.malaria.map(malaria_pop_by_catchment_2018,
                                                variable = "dr_2018",
                                                title = "2018",
                                                legend.title = "Observed malaria")

expected_malaria_2018_map <- create.malaria.map(expected_malaria_2018,
                                                variable = "expected_2018",
                                                title = "2018",
                                                legend.title = "Expected malaria")

# 2019 observed and expected malaria cases -------------------------------------
observed_malaria_2019_map <- create.malaria.map(malaria_pop_by_catchment_2019,
                                                variable = "dr_2019",
                                                title = "2019",
                                                legend.title = "Observed malaria")

expected_malaria_2019_map <- create.malaria.map(expected_malaria_2019,
                                                variable = "expected_2019",
                                                title = "2019",
                                                legend.title = "Expected malaria")

# 2020 observed and expected malaria cases -------------------------------------
observed_malaria_2020_map <- create.malaria.map(malaria_pop_by_catchment_2020,
                                                variable = "dr_2020",
                                                title = "2020",
                                                legend.title = "Observed malaria")

expected_malaria_2020_map <- create.malaria.map(expected_malaria_2020,
                                                variable = "expected_2020",
                                                title = "2020",
                                                legend.title = "Expected malaria")

# Layout maps ------------------------------------------------------------------
tmap::tmap_arrange(observed_malaria_2017_map, expected_malaria_2017_map,
                   observed_malaria_2018_map, expected_malaria_2018_map, 
                   observed_malaria_2019_map, expected_malaria_2019_map,
                   observed_malaria_2020_map, expected_malaria_2020_map, ncol = 2)
Fig 5: Observed and expected malaria incidence by health facility catchment area, Kasungu

Fig 5: Observed and expected malaria incidence by health facility catchment area, Kasungu

View SMR by catchment

A ratio greater than 1.0 indicates that more malaria cases have occurred than would have been expected, while a ratio less than 1.0 indicates that less cases have occurred.

#   max(SMR_2017$SMR)
# [1] 6.93
# > max(SMR_2018$SMR)
# [1] 7.93
# > max(SMR_2019$SMR)
# [1] 7.71
# > max(SMR_2020$SMR)
# [1] 5.68

# Function to create maps of Standard Morbidity Rate by catchment ---------------------
create.smr.map <- function(smr.data, 
                           variable = "SMR", 
                           title = NA, 
                           legend.title = "SMR"){
  # SMR by catchment map
  # smr.data: sf polygon object containing SMR by catchment data
  # variable: variable name (as character, in qoutes)
  # title: map title in quotes
  # legend.title: legend title in qoutes
  # returns:
  #   a tmap-element (plots a map)
 
  tm_shape(smr.data)+
    tm_fill(col = variable, 
            breaks = c(0, 0.5, 1, 1.5, 2, 2.5, 5, 8),
            palette = "-magma",
            title = legend.title)+
    tm_borders(lw = 0.3)+
    tm_layout(legend.position = c(0.75,"bottom"),
              legend.text.size = 0.5,
              legend.title.size = 0.7,
              frame = FALSE)+
    tm_credits(title, 
               position = c(0.2, 0.8), 
               size = 1)
}

# Invoking function -------------------------------------------------------------------
SMR_2017_map <- create.smr.map(SMR_2017, title = "2017")

SMR_2018_map <- create.smr.map(SMR_2018, title = "2018")

SMR_2019_map <- create.smr.map(SMR_2019, title = "2019")

SMR_2020_map <- create.smr.map(SMR_2020, title = "2020")

# Layout maps -------------------------------------------------------------------------
tmap::tmap_arrange(SMR_2017_map, SMR_2018_map, SMR_2019_map, SMR_2020_map, ncol = 2)
Fig. 6: Standardised morbidity ratio of malaria by health facility catchment

Fig. 6: Standardised morbidity ratio of malaria by health facility catchment

Calculate the proportion of the catchment population living within 1km, 2km, 3km of water bodies

First, using st_buffer, we compute 1km, 2km and 3km buffers around dry season water bodies obtained from LandSat satellite imagery using TropWet tool in Google Earth Engine. Then geometry of the buffer features are then combined resulting in resolved internal boundaries to enable extracting population values from WorldPop raster. Finally, we calculate the proportion of people in each catchment area living within water bodies.

# Combine and transform TropWet derived waterbody polygons -------------------------------
surface_waterbodies_2017 <- sf::st_as_sf(
  st_cast(
    st_union(
      st_buffer(dryseason_waterbodies_2017, dist = 30)), "POLYGON"))

surface_waterbodies_2018 <- sf::st_as_sf(
  st_cast(
    st_union(
      st_buffer(dryseason_waterbodies_2018, dist = 30)), "POLYGON"))

surface_waterbodies_2019 <- sf::st_as_sf(
  st_cast(
    st_union(
      st_buffer(dryseason_waterbodies_2019, dist = 30)), "POLYGON"))

surface_waterbodies_2020 <- sf::st_as_sf(
  st_cast(
    st_union(
      st_buffer(dryseason_waterbodies_2020, dist = 30)), "POLYGON"))

# Create function to compute 1km, 2km and 3km buffers around the water bodies ---------------------

create.waterbody.buffer <- function(waterbody, distance, catchment){
  # function for creating buffers around waterbodies
  # arguments:
  #   waterbody:  waterbody shapefile
  #   distance: buffer distance in meters
  #   catchment: catchment area shapefile
  # returns:
  #   buffered waterbodies 
  
  # Create buffers around water bodies
  buffer_radius <- sf::st_buffer(waterbody, distance)
  
  # Dissolve the buffers
  buffer_union <- sf::st_as_sf(st_cast(st_union(buffer_radius),"MULTIPOLYGON"))
  
  # Assign attributes of the 'catchment' to each of the water bodies. 
   buffer_intersect <- sf::st_intersection(buffer_union, catchment)
  
   buffer_intersect_sf <- sf::st_as_sf(buffer_intersect)
   
  # Convert the MULTIPOLYGON object into several POLYGON objects
   buffer_intersect_polygons <- sf::st_cast(
     sf::st_buffer(buffer_intersect_sf,0.0), "MULTIPOLYGON") %>% 
     sf::st_cast("POLYGON")
  
  # Polygons being seen to be in multiple catchments
   sf::st_intersects(buffer_intersect_polygons, catchment)
  
  # Make the assumption that the attribute is constant throughout the geometry
   sf::st_agr(buffer_intersect_polygons) = "constant"
   
   sf::st_agr(catchment) = "constant"
  
  return(out = buffer_intersect_polygons)
}


# Invoking function
# For 2017 TropWet surface water polygons --------------------------------------------------------
buffer_1km_2017 <- create.waterbody.buffer(waterbody = surface_waterbodies_2017, 
                                           distance = 1000, 
                                           catchment = malire_new)

buffer_2km_2017 <- create.waterbody.buffer(waterbody = surface_waterbodies_2017, 
                                           distance = 2000, 
                                           catchment = malire_new)

buffer_3km_2017 <- create.waterbody.buffer(waterbody = surface_waterbodies_2017, 
                                           distance = 3000,
                                           catchment = malire_new)

# For 2018 TropWet surface water polygons --------------------------------------------------------
buffer_1km_2018 <- create.waterbody.buffer(waterbody = surface_waterbodies_2018, 
                                           distance = 1000, 
                                           catchment = malire_new)

buffer_2km_2018 <- create.waterbody.buffer(waterbody = surface_waterbodies_2018, 
                                           distance = 2000, 
                                           catchment = malire_new)

buffer_3km_2018 <- create.waterbody.buffer(waterbody = surface_waterbodies_2018, 
                                           distance = 3000, 
                                           catchment = malire_new)
 
# For 2019 TropWet surface water polygons ------------------------------------------------------
buffer_1km_2019 <- create.waterbody.buffer(waterbody = surface_waterbodies_2019, 
                                           distance = 1000, 
                                           catchment = malire_new)

buffer_2km_2019 <- create.waterbody.buffer(waterbody = surface_waterbodies_2019, 
                                           distance = 2000, 
                                           catchment = malire_new)

buffer_3km_2019 <- create.waterbody.buffer(waterbody = surface_waterbodies_2019, 
                                           distance = 3000, 
                                           catchment = malire_new)

# For 2020 TropWet surface water polygons ------------------------------------------------------
buffer_1km_2020 <- create.waterbody.buffer(waterbody = surface_waterbodies_2020, 
                                           distance = 1000, 
                                           catchment = malire_new)

buffer_2km_2020 <- create.waterbody.buffer(waterbody = surface_waterbodies_2020, 
                                           distance = 2000, 
                                           catchment = malire_new)

buffer_3km_2020 <- create.waterbody.buffer(waterbody = surface_waterbodies_2020, 
                                           distance = 3000, 
                                           catchment = malire_new)

View the created waterbody buffers

# Map the buffers
create.buffer.map <- function(buffers, boundary = malire_new, title = NA){
  # function for creating buffer map in ggplot
  # arguments:
  #   buffer:  waterbodies buffer polygon layer
  #   boundary: health facility catchment polygons
  #   title: main title
  # returns:
  #   a map-element (plots a map)
  ggplot(data = buffers)+
     geom_sf()+
     geom_sf(data = boundary, 
             fill = NA)+
     theme_void()+
     labs(title = title)
}

# Invoking the function
# For 2017 -------------------------------------------------------------------------------
buffer_1km_2017_map <- create.buffer.map(buffer_1km_2017, title = "2017: 1km Buffers")

buffer_2km_2017_map <- create.buffer.map(buffer_2km_2017, title = "2017: 2km Buffers")

buffer_3km_2017_map <- create.buffer.map(buffer_3km_2017, title = "2017: 3km Buffers")

# For 2018 --------------------------------------------------------------------------------
buffer_1km_2018_map <- create.buffer.map(buffer_1km_2018, title = "2018: 1km Buffers")

buffer_2km_2018_map <- create.buffer.map(buffer_2km_2018, title = "2018: 2km Buffers")

buffer_3km_2018_map <- create.buffer.map(buffer_3km_2018, title = "2018: 3km Buffers")

# For 2019 ---------------------------------------------------------------------------------
buffer_1km_2019_map <- create.buffer.map(buffer_1km_2019, title = "2019: 1km Buffers")

buffer_2km_2019_map <- create.buffer.map(buffer_2km_2019, title = "2019: 2km Buffers")

buffer_3km_2019_map <- create.buffer.map(buffer_3km_2019, title = "2019: 3km Buffers")

# For 2020 --------------------------------------------------------------------------------
buffer_1km_2020_map <- create.buffer.map(buffer_1km_2020, title = "2020: 1km Buffers")

buffer_2km_2020_map <- create.buffer.map(buffer_2km_2020, title = "2020: 2km Buffers")

buffer_3km_2020_map <- create.buffer.map(buffer_3km_2020, title = "2020: 3km Buffers")
 
grid.arrange(buffer_1km_2017_map, buffer_1km_2018_map, buffer_1km_2019_map, buffer_1km_2020_map,
             buffer_2km_2017_map, buffer_2km_2018_map, buffer_2km_2019_map, buffer_2km_2020_map, 
             buffer_3km_2017_map, buffer_3km_2018_map, buffer_3km_2019_map, buffer_3km_2020_map, ncol = 4)
Fig 7. Buffers around dry season waterbodies in Kasungu

Fig 7. Buffers around dry season waterbodies in Kasungu

Extract the population living within waterbody buffers by catchment area

# Function to calculate estimated number of people living within waterbody buffers
# in each catchment area
estimate.buffer.pop <- function(catchment.population, buffers, catchment.area){
  
  # Extract population estimates from WorldPop raster
  buffers$buffer_pop <- raster::extract(catchment.population,
                                        buffers, 
                                        fun = sum, 
                                        na.rm = TRUE)
                                               
                                              
  # Find which catchment each polygon belongs to using its centroid - a point dataset 
  # representing the geographic center-points of the polygons 
  buffer_by_catchment <- st_intersection(st_centroid(buffers), catchment.area)
  
  # Notice that the buffer_catchment is comprised of separate POLYGONS (buffer_by_catchment$x). 
  # The first step is to “dissolve” away these POLYGONS into one MULTIPOLYGON. 
  # There is no sf equivalent to the QGIS or ArcMap “dissolve” operation. 
  # Instead we use a combination of group_by and summarize from the dplyr package. 
  # Stats::aggregate from sf package, and dplyr::summarize both do essentially the same.
   buffer_pop_aggregated <- buffer_by_catchment %>% 
     dplyr::group_by(DN) %>%
     dplyr::summarize(
       buffer_pop_aggregated = round(sum(buffer_pop, na.rm = TRUE)))
   
  buffer_pop <- merge(
    catchment.area, st_drop_geometry(
      buffer_pop_aggregated), by = 'DN', all.x = TRUE)
  
  return(out = buffer_pop)
  
}

# Invoking the function and calculating proportion of 
# catchment population living within buffers
# 2017 buffer population -------------------------------------------------------
buffer_pop_1km_2017 <- estimate.buffer.pop(
  kasungu_population_2017, 
  buffer_1km_2017, 
   malaria_pop_by_catchment_2017) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100)) 

buffer_pop_2km_2017 <- estimate.buffer.pop(
  kasungu_population_2017,
  buffer_2km_2017,
  malaria_pop_by_catchment_2017) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100)) 

buffer_pop_3km_2017 <- estimate.buffer.pop(
  kasungu_population_2017,
  buffer_3km_2017,
  malaria_pop_by_catchment_2017) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

# 2018 buffer population -------------------------------------------------------
buffer_pop_1km_2018 <- estimate.buffer.pop(
  kasungu_population_2018,
  buffer_1km_2018,
  malaria_pop_by_catchment_2018) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

buffer_pop_2km_2018 <- estimate.buffer.pop(
  kasungu_population_2018,
  buffer_2km_2018,
  malaria_pop_by_catchment_2018) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

buffer_pop_3km_2018 <- estimate.buffer.pop(
  kasungu_population_2018,
  buffer_3km_2018,
  malaria_pop_by_catchment_2018) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

# 2019 buffer population -------------------------------------------------------
buffer_pop_1km_2019 <- estimate.buffer.pop(
  kasungu_population_2019,
  buffer_1km_2019,
  malaria_pop_by_catchment_2019) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

buffer_pop_2km_2019 <- estimate.buffer.pop(
  kasungu_population_2019,
  buffer_2km_2019,
  malaria_pop_by_catchment_2019) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

buffer_pop_3km_2019 <- estimate.buffer.pop(
  kasungu_population_2019,
  buffer_3km_2019,
  malaria_pop_by_catchment_2019) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

# 2020 buffer population -------------------------------------------------------
buffer_pop_1km_2020 <- estimate.buffer.pop(
  kasungu_population_2020,
  buffer_1km_2020,
  malaria_pop_by_catchment_2020) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

buffer_pop_2km_2020 <- estimate.buffer.pop(
  kasungu_population_2020,
  buffer_2km_2020,
  malaria_pop_by_catchment_2020) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

buffer_pop_3km_2020 <- estimate.buffer.pop(
  kasungu_population_2020,
  buffer_3km_2020,
  malaria_pop_by_catchment_2020) %>% 
  dplyr::rename(catchment_pop = pop,
                buffer_pop = buffer_pop_aggregated) %>% 
  dplyr::mutate(
    prop_buffer_catchment_pop = round((buffer_pop/catchment_pop)*100))

Mapping proportion of catchment population living within waterbodies

# Function to create maps of proportion of people living in proximity ----------
# to water bodies in each catchment area
create.pop.proportion.map <- function(pop.data, 
                                      variable = "prop_buffer_catchment_pop", 
                                      title = NA, 
                                      legend.title = NA){
 
  # pop.data: sf polygon object containing proportion of catchment population 
  #           living within water bodies
  # variable: variable name (as character, in qoutes)
  # title: map title in quotes
  # legend.title: legend title in qoutes
  # returns:
  #   a tmap-element (plots a map)
 
  tm_shape(pop.data)+
    tm_fill(col = variable, 
            breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100),
            palette = "YlOrBr",
            title = legend.title)+
    tm_borders(lw = 0.3)+
    tm_layout(legend.position = c(0.8,"bottom"),
              legend.text.size = 0.5,
              legend.title.size = 0.7,
              frame = FALSE)+
    tm_credits(title, 
               position = c(0.25, 0.75), 
               size = 1)
}

# Invoking function 
# 2017 population proportion ---------------------------------------------------
pop_proportion_1km_2017_map <- create.pop.proportion.map(
  buffer_pop_1km_2017, 
  title = "2017",
  legend.title = "Population within \n1km buffers (%)")

pop_proportion_2km_2017_map <- create.pop.proportion.map(
  buffer_pop_2km_2017, 
  title = "2017",
  legend.title = "Population within \n2km buffers (%)")

pop_proportion_3km_2017_map <- create.pop.proportion.map(
  buffer_pop_3km_2017,
  title = "2017",
  legend.title = "Population within \n3km buffers (%)")

# 2018 population proportion ---------------------------------------------------
pop_proportion_1km_2018_map <- create.pop.proportion.map(
  buffer_pop_1km_2018,
  title = "2018",
  legend.title = "Population within \n1km buffers (%)")

pop_proportion_2km_2018_map <- create.pop.proportion.map(
  buffer_pop_2km_2018,
  title = "2018",
  legend.title = "Population within \n2km buffers (%)")

pop_proportion_3km_2018_map <- create.pop.proportion.map(
  buffer_pop_3km_2018,
  title = "2018",
  legend.title = "Population within \n3km buffers (%)")

# 2019 population proportion ---------------------------------------------------
pop_proportion_1km_2019_map <- create.pop.proportion.map(
  buffer_pop_1km_2019,
  title = "2019",
  legend.title = "Population within \n1km buffers (%)")

pop_proportion_2km_2019_map <- create.pop.proportion.map(
  buffer_pop_2km_2019,
  title = "2019",
  legend.title = "Population within \n2km buffers (%)")

pop_proportion_3km_2019_map <- create.pop.proportion.map(
  buffer_pop_3km_2019,
  title = "2019",
  legend.title = "Population within \n3km buffers (%)")

# 2020 population proportion ---------------------------------------------------
pop_proportion_1km_2020_map <- create.pop.proportion.map(
  buffer_pop_1km_2020,
  title = "2020",
  legend.title = "Population within \n1km buffers (%)")

pop_proportion_2km_2020_map <- create.pop.proportion.map(
  buffer_pop_2km_2020,
  title = "2020",
  legend.title = "Population within \n2km buffers (%)")

pop_proportion_3km_2020_map <- create.pop.proportion.map(
  buffer_pop_3km_2020,
  title = "2020",
  legend.title = "Population within \n3km buffers (%)")

# Layout maps ------------------------------------------------------------------
tmap::tmap_arrange(pop_proportion_1km_2017_map, pop_proportion_2km_2017_map, 
                   pop_proportion_3km_2017_map, pop_proportion_1km_2018_map,
                   pop_proportion_2km_2018_map, pop_proportion_3km_2018_map,
                   pop_proportion_1km_2019_map, pop_proportion_2km_2019_map,
                   pop_proportion_3km_2019_map, pop_proportion_1km_2020_map,
                   pop_proportion_2km_2020_map, pop_proportion_3km_2020_map, ncol = 3)
Fig 8. Proportion of catchment population living around water bodies

Fig 8. Proportion of catchment population living around water bodies

Scatter plots of SMR against the proportion of the catchment population living waterbody buffers

# Function to tidy and bind the SMR and proportion of --------------------------
# buffer-catchment population data frames
tidy.data <- function(smr.df, 
                      proportion.pop.1km, 
                      proprotion.pop.2km,
                      proportion.pop.3km){

# Convert the sf object to dataframes-------------------------------------------
smr_df <- as.data.frame(smr.df) %>% 
  dplyr::select(rowID, Names, SMR)

proportion_pop_1km_df <- as.data.frame(proportion.pop.1km) %>% 
  dplyr::select(rowID, prop_pop_1km = `prop_buffer_catchment_pop`)

proportion_pop_2km_df <- as.data.frame(proprotion.pop.2km)%>% 
  dplyr::select(rowID, prop_pop_2km = `prop_buffer_catchment_pop`)

proportion_pop_3km_df <- as.data.frame(proportion.pop.3km)%>% 
  dplyr::select(rowID, prop_pop_3km = `prop_buffer_catchment_pop`)

# Merge SMR and population data frames -----------------------------------------
combined_1 <- merge(smr_df, proportion_pop_1km_df, by = "rowID", all = TRUE)

combined_2 <- merge(proportion_pop_2km_df, proportion_pop_3km_df)

combined_fully <- merge(combined_1, combined_2, by = "rowID", all = TRUE)

}

# Invoking the function --------------------------------------------------------
smr_pop_2017 <- tidy.data(SMR_2017, buffer_pop_1km_2017, buffer_pop_2km_2017, buffer_pop_3km_2017)

smr_pop_2018 <- tidy.data(SMR_2018, buffer_pop_1km_2018, buffer_pop_2km_2018, buffer_pop_3km_2018)

smr_pop_2019 <- tidy.data(SMR_2019, buffer_pop_1km_2019, buffer_pop_2km_2019, buffer_pop_3km_2019)

smr_pop_2020 <- tidy.data(SMR_2020, buffer_pop_1km_2020, buffer_pop_2km_2020, buffer_pop_3km_2020)

# 2017 scatter plots -----------------------------------------------------------
scatter_1km_2017 <- ggplot2::ggplot(
  smr_pop_2017, 
  aes(x = prop_pop_1km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 1km buffer",
       y = "SMR",
       title = "2017")

scatter_2km_2017 <- ggplot2::ggplot(
  smr_pop_2017, 
  aes(x = prop_pop_2km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 2km buffer",
       y = "SMR")

scatter_3km_2017 <- ggplot2::ggplot(
  smr_pop_2017, 
  aes(x = prop_pop_3km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 3km buffer",
       y = "SMR")

# 2018 scatter plots -----------------------------------------------------------
scatter_1km_2018 <- ggplot2::ggplot(
  smr_pop_2018, 
  aes(x = prop_pop_1km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 1km buffer",
       y = "SMR",
       title = "2018")

scatter_2km_2018 <- ggplot2::ggplot(
  smr_pop_2017, 
  aes(x = prop_pop_2km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 2km buffer",
       y = "SMR")

scatter_3km_2018 <- ggplot2::ggplot(
  smr_pop_2018, 
  aes(x = prop_pop_3km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 3km buffer",
       y = "SMR")

# 2019 scatter plot ------------------------------------------------------------
scatter_1km_2019 <- ggplot2::ggplot(
  smr_pop_2019, 
  aes(x = prop_pop_1km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 1km buffer",
       y = "SMR",
       title = "2019")


scatter_2km_2019 <- ggplot2::ggplot(
  smr_pop_2019, 
  aes(x = prop_pop_2km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 2km buffer",
       y = "SMR")

scatter_3km_2019 <- ggplot2::ggplot(
  smr_pop_2019, 
  aes(x = prop_pop_3km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 3km buffer",
       y = "SMR")

# 2020 scatter plots -----------------------------------------------------------
scatter_1km_2020 <- ggplot2::ggplot(
  smr_pop_2020, 
  aes(x = prop_pop_1km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 1km buffer",
       y = "SMR",
       title = "2020")

scatter_2km_2020 <- ggplot2::ggplot(
  smr_pop_2020, 
  aes(x = prop_pop_2km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 2km buffer",
       y = "SMR")

scatter_3km_2020 <- ggplot2::ggplot(
  smr_pop_2020, 
  aes(x = prop_pop_3km, 
      y = SMR))+
  geom_point()+
  theme_classic()+
  stat_smooth(method = 'lm',
              col = "#C42126")+
  labs(x = "Percentage of catchment population \nliving in 3km buffer",
       y = "SMR")

# Create a list of plots -------------------------------------------------------
plot_lists <- list(scatter_1km_2017, scatter_2km_2017, scatter_3km_2017,
                   scatter_1km_2018, scatter_2km_2018, scatter_3km_2019,
                   scatter_1km_2019, scatter_2km_2019, scatter_3km_2019, 
                   scatter_1km_2020, scatter_2km_2020, scatter_3km_2020)

# Arrange the plots ----------------------------------------------------------
grid.arrange(grobs = plot_lists, ncol = 3)
Fig 9. Relationship between standardised morbidity ratio and living near waterbodies

Fig 9. Relationship between standardised morbidity ratio and living near waterbodies